-
Notifications
You must be signed in to change notification settings - Fork 361
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TierFS enhancements #1008
TierFS enhancements #1008
Conversation
itaiad200
commented
Dec 7, 2020
- Add metrics to pyramid.TierFS
- Single access to block.Adapter during concurrent reads.
- Thread-safe directory creation and deletions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Looks really good.
Important points:
- No test for concurrent deletes, or for concurrent adds and deletes.
- Monitoring should put hits and misses in same metric.
if err := os.Remove(dir); err != nil { | ||
return err | ||
} | ||
dir = parentDir |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This loop can continue way up, above the planned location. I think directory
needs a concept of a ceilingDir
or firstAncestorDir
or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was planning for rooted workspace
dir to back me up, but it's indeed flaky. Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much better, but still fragile! If I can control the inputs to removeFromLocal
to include ..
, then this loop happily steps above it.
E.g., if ceilingDir
is /a/b/data
and I convince the FS to work with a file named x/../../c/d/e/f
, then I can happily destroy many things. Right now this is not severe because we do not allow user-controlled paths into the tiers. But e.g. in the (alternate!) future where we afford users more control over the final location of their files, we might end up with user-controlled paths.
I would be happier if we at least documented that paths are not clean and must never be user-controlled. Or we could resolve to actual paths and work from there. E.g. @nopcoder suggested filepath.Clean
to resolve a somewhat similar case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding documentation for now, really can't see how the paths (other than the base directory) will be user-controlled in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool. I don't particularly mind "user-controlled" paths when that user is the owner of lakeFS. I'm worried about "user-controlled" paths when they come from a lakeFS user, who is supposed only to have permissions to act on files inside lakeFS.
Thanks!
pyramid/file.go
Outdated
func (f *File) Sync() error { | ||
return f.fh.Sync() | ||
} | ||
|
||
func (f *File) Close() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This existence of this function makes me doubt that the change at line 10 is safe. Specifically, it makes pyramid.File
have a safe conversion to File
. So now I can
func logAndClose(f *os.File) {
log("closing!")
f.Close()
}
func unsafe(f *pyramid.File) {
// use f...
logAndClose(f.File)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, fixed!
pyramid/tier_fs_test.go
Outdated
@@ -17,7 +23,7 @@ import ( | |||
|
|||
var ( | |||
fs FS | |||
adapter block.Adapter | |||
adapter *memAdapter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just avoid global state and create its test with its own adapter? As is we cannot run tests in parallel, and we need weird checks on the gets
counter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes - fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Concurrency testing is still incomplete, it seems hard to synchronize all threads of execution to reach the single decision of allowing just one to complete. Perhaps see how @ozkatz did it in ChanLocker
?
pyramid/metrics.go
Outdated
prometheus.HistogramOpts{ | ||
Name: "tier_fs_eviction_bytes", | ||
Help: "TierFS evicted object size by bytes", | ||
Buckets: []float64{0.5 * kb, 1 * kb, 16 * kb, 32 * kb, 128 * kb, 512 * kb, 1 * mb, 2 * mb, 4 * mb, 8 * mb, 16 * mb, 64 * mb}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^^^^^?
if err := os.Remove(dir); err != nil { | ||
return err | ||
} | ||
dir = parentDir |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much better, but still fragile! If I can control the inputs to removeFromLocal
to include ..
, then this loop happily steps above it.
E.g., if ceilingDir
is /a/b/data
and I convince the FS to work with a file named x/../../c/d/e/f
, then I can happily destroy many things. Right now this is not severe because we do not allow user-controlled paths into the tiers. But e.g. in the (alternate!) future where we afford users more control over the final location of their files, we might end up with user-controlled paths.
I would be happier if we at least documented that paths are not clean and must never be user-controlled. Or we could resolve to actual paths and work from there. E.g. @nopcoder suggested filepath.Clean
to resolve a somewhat similar case.
Codecov Report
@@ Coverage Diff @@
## master #1008 +/- ##
==========================================
+ Coverage 46.26% 46.33% +0.07%
==========================================
Files 152 153 +1
Lines 12318 12342 +24
==========================================
+ Hits 5699 5719 +20
- Misses 5895 5898 +3
- Partials 724 725 +1
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thanks! Sorry for the delayed approval, I forgot about this one.
if err := os.Remove(dir); err != nil { | ||
return err | ||
} | ||
dir = parentDir |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool. I don't particularly mind "user-controlled" paths when that user is the owner of lakeFS. I'm worried about "user-controlled" paths when they come from a lakeFS user, who is supposed only to have permissions to act on files inside lakeFS.
Thanks!